DNA Repair

Field of Invention 

The invention relates to products and methods for use in identifying
mismatched oligonucleotides in at least a fragment of genetic material for use,
particularly but not exclusively, in the diagnosis of genetic disease.

The diagnosis of genetic disease using DNA markers linked to disease genes
or, increasingly, the direct analysis of the relevant gene for mutations of the
DNA sequence is of enormous clinical importance. If it were possible to
simplify the detection of DNA sequence mutations, a common approach could
be adopted for the diagnosis of, or prediction of, predisposition to a wide
range of inherited and acquired conditions. The most sensitive method
currently available for the detection of mutations is probably that of Chemical
Cleavage of Mutations (CCM). This method can theoretically detect all
mismatches and small insertions and deletions and provide an approximate
location of the mismatch. However, this approach is cumbersome and
difficult to use in practice; it still requires confirmation by DNA sequencing;
and for batches of large sample numbers it would become very time-consuming.
New methods are now being investigated based on the use of
proof-reading enzymes or repair enzymes whose function in vitro is the
recognition of mismatches.

Just one example of a disease characterised by a high incidence of mutation
is hereditary non-polyposis colorectal cancer (HNPCC). This is one of the
most common human autosomal dominant diseases. Estimations are that up
[...] which can destroy the normal functioning of critical genes and lead to tumour
formation (6).

MutS has been used in methods involving proof-reading or repair enzymes.
Binding of this enzyme to mismatches has been shown to be sensitive and
specific (7). However, MutS which has been expressed in bacterial systems
and purified is unstable.

The cDNA of hMSH-2 is known (12); furthermore, hMSH-2 has been shown
to bind to DNA containing nucleotide mismatches in vitro (8,9). However,
it is difficult to achieve acceptable levels of expression of hMSH-2 in
expression systems. Our own work to express hMSH-2 in baculovirus
expression systems has not been successful. This is probably a consequence
of the overexpression of full length hMSH-2 being deleterious to the growth
of the insect virus in cell culture. However, we have shown that fragments
of hMSH-2 can be successfully expressed in bacteria and we have identified
a domain of the hMSH-2 enzyme which, when expressed, displays mismatch
binding activity in vitro. hMSH-2 and its homologues are very highly
conserved over their carboxy terminal domains. We have examined the
domain and we have found that it demonstrates homology to a type A
consensus sequence found in many proteins that bind and hydrolyse
nucleotides (10). It has been shown that MutS displays a weak ATPase
activity in the presence and absence of DNA, and genetic alteration of this
ATP binding site results in a protein which is defective in mismatch repair.
It is therefore notable that we demonstrate herein that the aforementioned
domain of hMSH-2 also exhibits ATPase activity.



It is therefore an object of the invention to provide a protein fragment that
shows mismatch binding activity in vitro.



According to a first aspect of the invention there is provided an isolated
polypeptide showing mismatch nucleotide binding activity in vitro, said
polypeptide comprising at least a part of the C-terminal domain of a
nucleotide binding protein, or a type A nucleotide binding motif, which
domain, or motif, further exhibits ATPase activity.



Reference herein to a type A nucleotide binding motif includes reference to
a motif that has been identified following structural studies and shown to
comprise a type A sequence including a flexible loop bounded by a β sheet
with an α helix on either side (22-26).

Ideally the polypeptide is a part of an enzyme whose function in vitro is the
recognition of mismatches, such as a proof-reading enzyme or repair enzyme,
and ideally a C-terminal domain of said enzyme.



More preferably still said polypeptide comprises approximately 300, and more 
preferably 297, amino acids and ideally the last 270 amino acids of said 
enzyme. 

More preferably further still said polypeptide comprises amino acids 637 to 
877 of the protein hMSH-2. Even more preferably still said polypeptide 
comprises amino acids 664 to 877. More preferably further still said 
polypeptide comprises amino acids 664 to 805 of hMSH-2. 

More preferably yet still said polypeptide comprises the amino acids shown
in the alignment sequence of Figure 1, or at least a substantial part thereof,
which part functions in vitro to recognise mismatches, as herein broadly
described.

More preferably still said polypeptide comprises the nucleotide binding
domain of hMSH-2, or a homologue or analogue thereof, comprising lysine
at amino acid position 675.

According to a second aspect of the invention there is provided an expression
system for the manufacture of a protein fragment in accordance with the
invention, which system comprises a host cell comprising a fragment of DNA
encoding the protein fragment of the invention, which DNA is functionally
coupled to the replication system of the host cell whereby the protein
fragment of the invention can be made.

According to a third aspect of the invention there is provided a vector for 
transforming a host cell whereby the protein fragment of the invention can 
be made. 



According to a fourth aspect of the invention there is provided a method for
obtaining the protein fragment of the invention comprising:

a) inserting a fragment of DNA encoding the protein fragment of
the invention and any necessary transcriptional/translational
control elements into a suitable host cell expression system;

b) providing conditions which favour transcription and translation
of said DNA in said host cells;

c) harvesting said host cells;

d) lysing said host cells;

e) collecting the lysate and purifying the fragment.

In each of the above expression systems and methods the host cell is ideally
a bacterial host cell.

Although obtaining the protein fragment of the invention has been described 
with reference to a biological expression system the said fragment may also 
be synthetically produced. 

In yet a further preferred aspect of the invention said protein fragment is
provided with a tag, for example a C-terminal Flag peptide such as (Asp Tyr
Lys Asp Asp Asp Asp Lys). However, any other tag such as a fluorescent
marker or radioactive marker may be used.

In the instance where said Flag is used, an antibody, ideally monoclonal,
which specifically binds to the Flag can be used to identify the fragment.

According to a further aspect of the invention there is provided a DNA
sequence encoding the protein fragment of the invention.

According to a further aspect of the invention there are provided
oligonucleotides for amplifying said DNA.

According to a yet further aspect of the invention there is provided a method
for identifying mismatched oligonucleotides comprising exposing strands of
oligonucleotides to the protein fragment of the invention under conditions
which promote binding; and determining the amount of binding taking place.

If preferred, the said oligonucleotides can be tagged using, for example, a
radiolabel.

According to a yet further aspect of the invention there is provided a kit for 
determining mismatch binding comprising at least the protein fragment of the 
invention. 

Ideally said kit comprises a control comprising at least one mismatched
binding pair of oligonucleotides and ideally at least one matched
complementary binding pair of oligonucleotides.

According to a yet further aspect of the invention there is provided the use
of a fragment of a nucleotide binding protein for detection of mismatched
complementary oligonucleotide pairs or of mismatches in double-stranded
nucleic acid fragments or in double-stranded PCR products.

According to a yet further aspect of the invention there is provided means for 
regulating the activity of a nucleotide binding protein, or a fragment thereof, 
comprising the substitution, or deletion, of a critical codon, or amino acid, in 
the nucleotide binding domain thereof. 

Preferably, said substitution or deletion comprises a manipulation of: the
codon encoding the amino acid lysine at codon 675 of hMSH-2, or its
equivalent in a homologous or analogous protein; or the corresponding lysine
amino acid, so that lysine is either substituted or deleted in the relevant
protein, or a fragment thereof.

An embodiment of the invention will now be described by way of example 
only with reference to the following figures, materials and methods wherein: 

Figure 1 Shows alignment of amino acid sequences of the conserved
COOH terminal region of hMSH-2, MSH-2 and MutS.

Figure 2 Shows how a DNA fragment containing the carboxy terminal
domain of hMSH-2 was generated using PCR. This fragment
contained amino acids 611 to 852 of the published sequence
(11). The domain was ligated to pFlag.CTC to derive
phMSH2.Flag.



Figure 3 Shows analysis of hMSH-2 Flag fusion protein.
Transformed E. coli were grown at 37°C for 2 hours; the
cultures were then either grown for a further 5 hours (Lane 1), or
induced with IPTG (1 mM) and grown for 0 hours (Lane 2), 2 hours
(Lane 3) and 5 hours (Lane 4). Extracts were resolved by SDS
PAGE, transferred to nitrocellulose, and incubated with M2
monoclonal antibody (IgG1) to the Flag epitope, and immune
complexes were detected by using rabbit anti-mouse
immunoglobulin conjugated with horseradish peroxidase. The
nitrocellulose membranes were developed in PBS containing
0.02% 1-chloro-4-naphthol and 0.006% hydrogen peroxide.

Figure 4 Shows ATPase analysis of the bacterial fusion. Hydrolysis of
various substrate concentrations of [α-32P]ATP by the carboxy
terminal domain of hMSH-2 was assayed by thin layer
chromatography, and quantified using a scintillation counter.

Figures 5 and 5A Show functional analysis of the bacterial fusion protein.
Oligonucleotides containing either a perfect match or a selected
single mismatch were radiolabelled using primer extension.
One pmole of labelled DNA was incubated for 1 hour with:
Figure 5, E. coli MutS (Lane 1), or protein extracts of Flag
(Lane 2), or hMSH-2.Flag (Lanes 3-4); or Figure 5A, protein
extracts of hMSH-2.Flag or Flag. After the incubation period
the mixtures were slowly filtered over prewet nitrocellulose,
washed, and bound DNA was detected using autoradiography.

Figures 6 and 6A Show quantification of the binding assay.
Figures 6 and 6A correspond to Figures 5 and 5A, respectively.
Radioactive spots on the nitrocellulose filter were excised and
quantified using a scintillation counter. The results are shown
as counts per minute and variations between three replicated
assays are indicated.

Figure 7 Homology between hMSH-2 DNA binding domain and the
'Type A' consensus sequence. Bold type indicates conserved
residues between both proteins. The mutants produced are
indicated showing the alteration of each conserved residue.

Figure 8 Analysis of expression of mutant fusion proteins. Coomassie
blue stained SDS PAGE gel showing protein extracts analysed
4 hours post induction with 1 mM IPTG. Lanes 1-2, hMSH-2
domain - uninduced and induced; Lanes 3-4, A1 - uninduced
and induced; Lanes 5-6, A2 - uninduced and induced; Lanes 7-8,
A3 - uninduced and induced; Lanes 9-10, A4 - uninduced and
induced; Lanes 11-12, A5 - uninduced and induced, respectively.

Figure 9 ATPase analysis of the mutant bacterial fusion proteins.
Hydrolysis of various substrate concentrations of [α-32P]ATP
by the carboxy terminal domain of hMSH-2, pET and A1-5
was assayed by thin layer chromatography, and quantified
using a scintillation counter.

Figure 10 Functional analysis of the mutant bacterial fusion proteins.
Oligonucleotides containing either a perfect match or a range
of single mismatches were radiolabelled using polynucleotide
kinase. One pmole of labelled DNA was incubated for 1 hour
with protein extracts of hMSH-2, pET and A1-5. After the
incubation period the mixtures were slowly filtered over prewet
nitrocellulose, washed, and bound DNA was detected using
autoradiography.

Materials and Methods 



Construction of a hMSH-2 C-terminal domain expression vector 
A DNA fragment containing the cDNA fragment required from hMSH-2 was
generated by polymerase chain reaction (PCR) using 10 ng of plasmid
pBShMSH2 DNA, and 250 ng of the oligonucleotides dCCG AAG CTT AGG
CAT GCT TGT GTT GAA GTT CAA GAT and dGCG GGA TCC TGT
TTC CAG ATA GCA CTT CTT TGC TGC. These oligonucleotides
incorporated HindIII and BamHI restriction sites respectively, for convenient
cloning of the PCR product. They generate a PCR fragment which encodes
amino acid 637 to amino acid 877 in the published sequence (11). The
reaction was performed with 4 units of Taq polymerase (Promega) in the
buffer recommended by the supplier. After 30 cycles (1 minute at 92°C, 1
minute at 60°C, 1 minute at 72°C), the DNA produced was phenol/chloroform
extracted, ethanol precipitated, digested with BamHI and HindIII and cloned
into the corresponding sites of pFlag.CTC (IBI) to derive phMSH2.Flag. The
integrity of the insert was checked by DNA sequencing (data not shown).
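
As an illustrative check only (it is not part of the described method), the
following Python sketch scans the two primers quoted above for the HindIII
(AAGCTT) and BamHI (GGATCC) recognition sequences and computes the length of
the coding region for amino acids 637 to 877. The labels primer_1 and primer_2
and the function names are hypothetical and are introduced here for
illustration.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"HindIII": "AAGCTT", "BamHI": "GGATCC"}

# Primers as quoted in the text, written 5'->3'.
# The labels are hypothetical; spaces are for readability only.
PRIMERS = {
    "primer_1": "CCG AAG CTT AGG CAT GCT TGT GTT GAA GTT CAA GAT",
    "primer_2": "GCG GGA TCC TGT TTC CAG ATA GCA CTT CTT TGC TGC",
}


def sites_in(primer):
    """Return the names of the restriction sites found in a primer."""
    seq = primer.replace(" ", "").upper()
    return [name for name, site in SITES.items() if site in seq]


def coding_length_bp(first_residue, last_residue):
    """Base pairs needed to encode an inclusive range of residues."""
    return 3 * (last_residue - first_residue + 1)


site_map = {label: sites_in(seq) for label, seq in PRIMERS.items()}
insert_bp = coding_length_bp(637, 877)
print(site_map, insert_bp)
```

The coding region for residues 637 to 877 comes to 723 bp, which is in line
with the fragment of about 720 bp reported in the Results.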

Expression of hMSH-2 C-terminal domain as a bacterial fusion protein

The PCR product encoding the 637 to 877 amino acid hMSH-2 domain in the
Flag bacterial expression vector (IBI) was used to transform E. coli strain
DH5α. A fresh overnight culture of transformed E. coli was diluted 1 in 20
with LB medium containing ampicillin (100 µg/ml). After growth at 37°C
for 2 hours, the culture was induced with IPTG (1 mM) and grown at 37°C
for a further 5 hours. The cells were harvested by centrifugation at 3200g for
10 minutes, resuspended in 0.1 volume of lysis buffer (100 mM Tris-HCl, pH
8.0, 1 mM EDTA) and incubated on ice with 3 mg/ml of lysozyme for 30
minutes. The cells were then sonicated and lysed by the addition of Tween
20 lysis buffer (100 mM Tris-HCl, pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.3
mg/ml phenylmethylsulphonyl fluoride, 0.8 µg/ml pepstatin, 1 mM DTT, 1%
Tween 20). Cellular debris was pelleted by centrifugation at 4,000g.

Detection of fusion protein by Western blot analysis

Protein extracts were mixed with 2x reducing sample buffer (50 mM Tris-HCl,
pH 6.8, 4% sodium dodecyl sulphate (SDS), 5 mM EDTA, 10%
β-mercaptoethanol, 1 mM DTT and 0.01% bromophenol blue). After boiling
for 3 minutes, samples were fractionated on a 12% SDS polyacrylamide gel.
After electrophoresis the gel was soaked for 10 minutes in transfer buffer (25
mM Tris, 192 mM glycine, 20% methanol (v:v), and 0.1% SDS), and the
proteins were transferred to nitrocellulose membranes by electroblotting for
3 hours at 250 mA. After transfer, the membranes were soaked in PBS and
incubated for 2 hours in blocking buffer (PBS containing 5% nonfat dry
milk). Membranes were incubated with a 1/100 dilution of the M2
monoclonal antibody (IgG1, IBI), washed with PBS and incubated for 1 hour
at 37°C with a 1/1000 dilution of rabbit anti-mouse immunoglobulin
conjugated with horseradish peroxidase in blocking buffer. After five washes
with PBS the nitrocellulose membranes were developed in PBS containing
0.02% 1-chloro-4-naphthol and 0.006% hydrogen peroxide.

ATPase Assay 

The assay was performed at 37°C in 20 mM Tris-HCl pH 7.6, 0.5 mM
CaCl2, 5 mM MgCl2, 1 mM DTT, 100 µg/ml BSA, 0.1 mM EDTA and 150
ng of hMSH-2 domain. Assays were performed using 2, 2.5, 3.3, 5 and 10
µM ATP. Hydrolysis of [α-32P]ATP by the carboxy terminal domain of
hMSH-2 was assayed by thin layer chromatography. The radioactive counts
for ATP and its hydrolysis products were quantified using a scintillation
counter (Packard).

Functional binding assay

Mismatch binding was detected by a nitrocellulose binding assay of labelled
oligonucleotides followed by autoradiography. Oligonucleotides (dCGG ATC
CGG AXG TCA TGG AAT TCC and dGGA ATT CCA TGA CXT CCG
GAT CCG) were synthesised and annealed to produce either a perfect
matched double-stranded molecule or a single G:T mismatch (position shown
in bold type). Oligonucleotides were mixed to a final concentration of 100
pmole/µl each in 100 µl STM (100 mM NaCl, 10 mM Tris-HCl, pH 7.0, 10
mM MgCl2, 5 mM DTT), heated to 95°C and cooled to 25°C over 2 hours.
The annealed products were then stored in 50% glycerol at -20°C until
required. End-labelling of double-stranded DNA (100 pmole) in STM buffer
was by polynucleotide kinase. After incubating at 20°C for 10 minutes the
unincorporated label was removed using a Sephadex NAP 5 column. The
labelled DNA was diluted to 0.2 pmole/µl. The binding assay used 1 pmole
of DNA with 150 ng hMSH-2 domain in a total volume of 10 µl. After 1
hour on ice the mixture was slowly filtered over pure prewetted nitrocellulose
(Millipore, 0.45 µm) and washed in STM buffer. The filter was then allowed
to air dry and bound DNA was detected by autoradiography. Bound material
was quantified using a scintillation counter (Packard).
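
By way of illustration only, the following Python sketch shows where the
variable base X of the two oligonucleotides quoted above falls once the
strands are annealed. It assumes, purely for the example, that the perfect
match carries G on the first strand opposite C on the second, and that the
G:T mismatch is made by placing T on the second strand; the text does not
state which strand carries which base. All names are hypothetical.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Oligonucleotide templates from the text, 5'->3'; X marks the variable base.
TOP_TEMPLATE = "CGGATCCGGAXGTCATGGAATTCC"
BOTTOM_TEMPLATE = "GGAATTCCATGACXTCCGGATCCG"


def fill(template, base):
    """Substitute a concrete base for the X placeholder."""
    return template.replace("X", base)


def mismatch_positions(top, bottom):
    """1-based positions on the top strand that are not Watson-Crick paired."""
    if len(top) != len(bottom):
        raise ValueError("strands must be the same length")
    # Both strands are written 5'->3', so the bottom strand is reversed
    # to line each base up with its partner on the top strand.
    aligned = bottom[::-1]
    return [i + 1 for i, (t, b) in enumerate(zip(top, aligned))
            if COMPLEMENT[t] != b]


# Assumed base choices (see the note above): G/C matched, G/T mismatched.
perfect = mismatch_positions(fill(TOP_TEMPLATE, "G"), fill(BOTTOM_TEMPLATE, "C"))
g_t = mismatch_positions(fill(TOP_TEMPLATE, "G"), fill(BOTTOM_TEMPLATE, "T"))
print(perfect, g_t)
```

Under these assumptions the matched duplex has no mispaired positions and the
G:T duplex has a single mispair at position 11 of the 24-mer.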

Construction of mutant hMSH-2 nucleotide binding domain expression vectors 

DNA fragments expressing the C-terminal DNA binding domain sufficient
to bind specific mismatched oligonucleotides and mutants 1-5 shown in
Figure 7 were generated by polymerase chain reaction (PCR) using 10 ng of
plasmid pBShMSH-2 DNA, and 250 ng of the following forward primers
dCCG GGA TCC TTC CAC ATC ATT ACT GGC CCC AAT ATG GGA
GGT AAA TCA; dCCG GGA TCC TTC CAC ATC GGT ACT GGC CCC
AAT ATG GGA GGT AAA TCA; dCCG GGA TCC TTC CAC ATC ATT
ACT GCC CCC AAT ATG GGA GGT AAA TCA; dCCG GGA TCC TTC
CAC ATC ATT ACT GGC CCC AAT ATG GGA GCT AAA TCA; dCCG
GGA TCC TTC CAC ATC ATT ACT GGC CCC AAT ATG GGA GGT
GCA TCA; dCCG GGA TCC TTC CAC ATC ATT ACT GGC CCC AAT
ATG GGA GGT AAA GCA and the reverse primer dGCG GGA TCC TCT
TTC CAG ATA GCA CTT CTT TGC TGC (changes shown in bold type).
These oligonucleotides incorporated BamHI restriction sites for convenient
cloning of the PCR products. The reaction was performed with 4 units of Pfu
DNA Polymerase (Stratagene) in the buffer recommended by the supplier.
After 30 cycles (1 min at 92°C, 1 min at 60°C, 1 min at 72°C), the DNA produced
was phenol/chloroform extracted, ethanol precipitated, digested with BamHI
and cloned into the corresponding site of pET21a (Novagen) to derive
pEThMSH-2 and pETA1-5 respectively. The integrity of each insert was
confirmed by DNA sequencing (data not shown).
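
Because the bold type marking the altered bases does not survive in plain
text, the changes can be recovered by comparing each primer with the first
one. The Python sketch below does this codon by codon. It assumes that the
first forward primer listed is the wild type and that the remaining five
correspond to mutants 1 to 5 in the order given; the codon numbers are
counted from the 5' end of the primer and are not hMSH-2 residue numbers.
All names are hypothetical.

```python
# Standard genetic code, restricted to the codons at the altered positions.
CODON_TO_AA = {
    "ATT": "Ile", "GGT": "Gly", "GGC": "Gly", "GCC": "Ala",
    "GCT": "Ala", "AAA": "Lys", "GCA": "Ala", "TCA": "Ser",
}

# Forward primers as listed in the text (5'->3'); first one assumed wild type.
WILD_TYPE = "CCG GGA TCC TTC CAC ATC ATT ACT GGC CCC AAT ATG GGA GGT AAA TCA"
MUTANTS = [
    "CCG GGA TCC TTC CAC ATC GGT ACT GGC CCC AAT ATG GGA GGT AAA TCA",
    "CCG GGA TCC TTC CAC ATC ATT ACT GCC CCC AAT ATG GGA GGT AAA TCA",
    "CCG GGA TCC TTC CAC ATC ATT ACT GGC CCC AAT ATG GGA GCT AAA TCA",
    "CCG GGA TCC TTC CAC ATC ATT ACT GGC CCC AAT ATG GGA GGT GCA TCA",
    "CCG GGA TCC TTC CAC ATC ATT ACT GGC CCC AAT ATG GGA GGT AAA GCA",
]


def codons(primer):
    """Split a primer into consecutive triplets, ignoring spaces."""
    seq = primer.replace(" ", "").upper()
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def codon_changes(wild_type, mutant):
    """List (codon number in primer, wild-type residue, mutant residue)."""
    return [(i + 1, CODON_TO_AA[a], CODON_TO_AA[b])
            for i, (a, b) in enumerate(zip(codons(wild_type), codons(mutant)))
            if a != b]


changes = [codon_changes(WILD_TYPE, m) for m in MUTANTS]
for number, change in enumerate(changes, start=1):
    print(number, change)
```

Each primer differs from the first at exactly one codon. The fourth and fifth
give Lys to Ala and Ser to Ala respectively, matching the changes the text
attributes to A4 (Lys 675) and A5 (Ser 676).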



Production of hMSH-2 nucleotide binding domain mutants as bacterial fusion
proteins

The wild type and mutant constructs encoding the amino acid 663-877 hMSH-2
domain in the pET bacterial expression vector were used to transform E. coli
strain BL21(DE3). A fresh overnight culture of transformed E. coli was
diluted 1 in 20 with LB medium containing ampicillin (100 µg/ml). After
growth at 37°C for 2 hours, the culture was induced with IPTG (1 mM) and
grown at 37°C for a further 5 hours. The cells were harvested by
centrifugation at 3200g for 10 minutes, resuspended in 0.1 volume of lysis
buffer (100 mM Tris-HCl, pH 8.0, 1 mM EDTA) and incubated on ice with
3 mg/ml of lysozyme for 30 minutes. The cells were then sonicated and
lysed by the addition of Tween 20 lysis buffer (100 mM Tris-HCl, pH 8.0,
200 mM NaCl, 1 mM EDTA, 0.3 mg/ml phenylmethylsulphonyl
fluoride, 0.8 µg/ml pepstatin, 1 mM DTT, 1% Tween 20). Cellular debris
was pelleted by centrifugation at 4,000g.

Detection of fusion protein by SDS-PAGE 

Protein extracts were mixed with 2x reducing sample buffer (50 mM Tris-HCl,
pH 6.8, 4% sodium dodecyl sulphate (SDS), 5 mM EDTA, 10%
β-mercaptoethanol, 1 mM DTT and 0.01% bromophenol blue). After boiling
for 3 minutes, samples were fractionated on a 12% SDS polyacrylamide gel.
Following electrophoresis the gel was stained with Coomassie blue solution
(25% v/v isopropyl alcohol, 10% v/v acetic acid and 0.25% w/v Coomassie
blue).

ATPase assay 

The assay was performed at 37°C in 20 mM Tris-HCl, pH 7.6, 0.5 mM CaCl2,
5 mM MgCl2, 1 mM DTT, 100 µg/ml BSA, 0.1 mM EDTA with 150 ng of
wild type or mutant hMSH-2 domains. Assays were performed using 2, 2.5,
3.3, 5 and 10 µM ATP. Hydrolysis of [α-32P]ATP by the wild type and each
mutant carboxy terminal domain was assayed by thin layer chromatography.
The radioactive counts for ATP and its hydrolysis products were quantified
using a scintillation counter (Packard).

Functional binding assay

Mismatch recognition was detected by a nitrocellulose binding assay of
labelled oligonucleotides followed by autoradiography as described previously
(1). Briefly, oligonucleotides (dCGG ATC CGG AXG TCA TGG AAT TCC
and dGGA ATT CCA TXA CAT CCG GAT CCG) were annealed to
produce either a perfect matched double-stranded molecule or a single
mismatch (position shown in bold type). Oligonucleotides were mixed to a
final concentration of 100 pmole/µl each in 100 µl STM (100 mM NaCl, 10
mM Tris-HCl, pH 7.0, 10 mM MgCl2, 5 mM DTT), heated to 95°C and
cooled to 25°C over 2 hours. End-labelling of double-stranded DNA (100
pmole) in STM buffer was performed with polynucleotide kinase. After
incubation at 20°C for 10 minutes the unincorporated label was removed
using a Sephadex NAP 5 column. The labelled DNA was diluted to 0.2
pmole/µl. The binding assay used 1 pmole of DNA with 150 ng of wild type
or each mutant hMSH-2 domain in a total volume of 10 µl. After 1 hour on
ice the mixture was slowly filtered over pure prewetted nitrocellulose
(Millipore, 0.45 µm) and washed in STM buffer. The filter was then allowed
to air dry and bound DNA was detected by autoradiography.
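
As a small worked example of the quantities quoted above, the Python sketch
below works out the volume of labelled DNA stock needed per binding reaction
and the volume left for protein extract and buffer. The split of the
remaining volume between extract and buffer is not stated in the text, so it
is left as a single figure; the names are hypothetical.

```python
def volume_ul(amount_pmol, concentration_pmol_per_ul):
    """Volume of stock (in microlitres) that contains the given amount."""
    return amount_pmol / concentration_pmol_per_ul


# Values taken from the text.
DNA_PER_ASSAY_PMOL = 1.0      # labelled DNA per binding assay
DNA_STOCK_PMOL_PER_UL = 0.2   # concentration after dilution
TOTAL_VOLUME_UL = 10.0        # total volume of the binding reaction

dna_volume_ul = volume_ul(DNA_PER_ASSAY_PMOL, DNA_STOCK_PMOL_PER_UL)
remaining_ul = TOTAL_VOLUME_UL - dna_volume_ul
print(dna_volume_ul, remaining_ul)
```

On these figures, half of each 10 µl reaction is labelled DNA stock and the
other half is available for the 150 ng of protein and any buffer.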



Results 



PCR amplification and cloning

Protein sequence alignments of hMSH-2 and its homologues, MutS, MSH-2
and GTBP revealed a highly conserved region at the COOH terminus (Figure
1). This region contains a type A nucleotide binding site consensus sequence.
A 720 bp fragment was amplified using PCR, incorporating BamHI and
HindIII restriction sites for convenient cloning. This fragment of the hMSH-2
cDNA sequence encodes amino acid residues 637 to 877. The PCR product
was ligated to pFlag.CTC, in phase with respect to the ATG translational start
codon immediately upstream of the multiple cloning site (MCS) and also in
frame with the C-terminal coding sequence immediately downstream of the
MCS to ensure proper fusion to the C-terminal Flag peptide (Asp Tyr Lys
Asp Asp Asp Asp Lys) (Figure 2).

Expression of the hMSH-2 C-terminal domain 

The hMSH-2 domain was thus cloned into the bacterial expression vector
Flag (IBI). Expression of the hMSH-2 Flag fusion protein resulted in a 30
kDa species detected by Western blot analysis on SDS-PAGE (Figure 3).
The anti-Flag M2 monoclonal (IgG1) mouse antibody (IBI) was used to
specifically bind to the eight amino acid Flag peptide, which identified the
249 amino acid recombinant protein comprising the hMSH-2 domain
(containing a type A nucleotide binding site consensus sequence) coupled to
the Flag peptide at its carboxy terminus.

ATPase Analysis of Bacterial Fusion Protein 

The Walker A-type nucleotide binding motif conserved in MutS proteins has
been shown to have ATPase activity (21). In order to determine whether the
carboxy terminal domain of hMSH-2 hydrolyses ATP to ADP and Pi,
[α-32P]ATP was incubated with the fusion protein and separated using TLC.
To determine Km and kcat values of the hMSH-2 domain, ATPase activity was
measured in the presence of various concentrations of ATP (Fig. 4). At 37°C
the Km and kcat were calculated to be 6.6 µM and 0.5 s⁻¹, respectively. In a
control experiment, nonenzymatic hydrolysis of ATP in the absence of the
expressed domain was less than 5%.
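
The text does not state how Km and kcat were derived from the rate
measurements. One common approach, shown here only as a sketch and not as the
method actually used, is a double-reciprocal (Lineweaver-Burk) fit of 1/v
against 1/[S], which suits the ATP concentrations listed in the Materials and
Methods (2, 2.5, 3.3, 5 and 10 µM). The rates below are synthetic and
noise-free, generated from the reported Km of 6.6 µM and a hypothetical Vmax
of 1.0 in arbitrary units; they are not experimental data.

```python
def michaelis_menten(s, km, vmax):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, rates):
    """Least-squares line through (1/[S], 1/v); returns (Km, Vmax).

    The line is 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# ATP concentrations (micromolar) as listed in the Materials and Methods.
ATP_UM = [2, 2.5, 3.3, 5, 10]

# Synthetic rates: reported Km, hypothetical Vmax in arbitrary units.
synthetic_rates = [michaelis_menten(s, 6.6, 1.0) for s in ATP_UM]

km_fit, vmax_fit = lineweaver_burk_fit(ATP_UM, synthetic_rates)
print(round(km_fit, 3), round(vmax_fit, 3))
```

With real data, kcat would follow by dividing the fitted Vmax by the molar
enzyme concentration in the assay; a direct non-linear fit is generally less
sensitive to noise at low substrate concentrations than the double-reciprocal
form.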

Functional analysis of the bacterial fusion protein 

A mismatch binding assay was developed to measure the hMSH-2 C-terminal
domain's activity. Mismatch binding was detected by nitrocellulose binding
of labelled oligonucleotides containing a mismatch at position 11 within the
context of a double-stranded 24-mer oligonucleotide pair. The binding of the
hMSH-2 domain to oligonucleotides containing a G/T mismatch is shown
in Figure 5. The binding of the hMSH-2 domain to a range of
mismatch-containing oligonucleotides is shown in Figure 5A. Radiolabelled
oligonucleotides containing a perfect match or a single mismatch were
incubated with purified MutS or protein extracts containing hMSH-2.Flag or
Flag alone. The binding of proteins to mismatched or matched
oligonucleotides was quantified using a scintillation counter (Figures 6 and
6A). In Figure 6 it can be seen that the MutS and hMSH-2 domain
selectively bound to the oligonucleotides containing the mismatch, but not the
perfectly matched oligonucleotides. In Figure 6A it can be seen that the hMSH-2
domain selectively bound to the oligonucleotides containing all possible
mismatches, apart from C/C and A/A mismatches. The Flag control bound
to no oligonucleotide, showing the hMSH-2 domain alone is sufficient to bind
oligonucleotides containing a G/T mismatch.

PCR amplification and cloning 

A fragment of the hMSH-2 cDNA sequence which encodes amino acid
residues 637 to 877 has been shown to bind oligonucleotides containing
mismatches (27). In order to determine which specific residues are important
in this domain, mutant proteins have been produced which alter specific
residues within the putative nucleotide binding region (Fig. 7). DNA
fragments which encode the nucleotide binding domain of hMSH-2 were
amplified using PCR, incorporating BamHI restriction sites for convenient
cloning. Each product was ligated to the expression vector pET21a, in phase
with respect to the ATG translational start codon immediately upstream of the
multiple cloning site (MCS) and also in frame with the C-terminal coding
sequence immediately downstream of the MCS to ensure proper fusion to the
C-terminal HisTag.

Expression of the hMSH-2 nucleotide binding domain mutants 

To confirm their integrity, each mutant hMSH-2 nucleotide binding domain
was cloned into the bacterial expression vector pET21a. Expression of the
mutant hMSH-2 fusion proteins resulted in 30 kDa species detected using
SDS-PAGE, comprising the hMSH-2 domain (containing a type A nucleotide
binding site consensus sequence) coupled to the HisTag peptide at its carboxy
terminus (Fig. 8). We designate these mutant fusion proteins A1 to A5.
All mutant proteins were expressed at comparable levels to the wild type
fusion protein.

ATPase analysis of mutant fusion proteins

It has been shown that the carboxy terminal domain of hMSH-2 contains
ATPase activity (27). In order to determine whether these mutants hydrolyse
ATP to ADP and Pi, [α-32P]ATP was incubated with each mutant fusion
protein and separated using TLC. To determine Km and kcat values for the
mutants, ATPase activity was measured in the presence of various
concentrations of ATP (Fig. 9). The results show that A1, A2 and A3 have
limited effects on the ATPase activity of the domain. Wild type Km and kcat
values were 8.33 µM and 0.55 s⁻¹, respectively, compared to A2 values of
5.88 µM and 0.633 s⁻¹, respectively. However, A5 had a reduced activity
with Km and kcat values of 3.6 µM and 0.65 s⁻¹, and A4, which alters
codon 675 from a lysine to an alanine, has a marked effect upon ATP
hydrolysis, effectively reducing it to zero. In a control experiment,
nonenzymatic hydrolysis of ATP in the absence of the wild type expressed
domain was less than 5%.

Functional analysis of the mutant fusion proteins 

A mismatch binding assay was developed to measure the hMSH-2 C-terminal
domain's activity (27). Mismatch recognition was detected by nitrocellulose
binding of labelled oligonucleotides containing a mismatch at position 11
within the context of a double-stranded 24-mer oligonucleotide pair. We
found that the wild type C-terminal domain of hMSH-2 selectively bound all
specific mismatches apart from A/A and C/C, in agreement with results
described previously (27). The pET control did not bind to any labelled
oligonucleotide pair. Applying this assay to the mutant proteins, we found
that A1, A2 and A3 bound the same specific mismatches as the wild type
domain, albeit to a somewhat lesser extent. This may be due to the amino
acid substitutions reducing recognition of the mismatches, or to a reduced
affinity of these proteins once bound to a mismatch, resulting in separation
from the mismatch in the washing procedures of the assay. A5, which alters
Ser 676 to an Ala, has further reduced affinity for these mismatches, and A4
was found to have no selective binding to any of the specific mismatches
(Fig. 10).

Discussion

hMSH-2 and its homologues are highly conserved around the domain containing
the type A nucleotide binding motif, and bind mismatched oligonucleotides.
Herein we have expressed this domain as a bacterial fusion protein and shown
that mismatch-containing oligonucleotides are selectively bound.

The inability of A4, which alters codon 675 from a lysine to an alanine,
to identify DNA containing mismatches suggests that this Lys 675 residue is
important for the binding function. It is unlikely that the mutation alters the
structure of this domain significantly so as to reduce stability in E. coli, as the
expression level of this mutant is comparable to that of the wild type hMSH-2
nucleotide binding domain. Thus the deficiency is not due to a gross
structural instability. At present the role of Lys 675 within the nucleotide
binding site of hMSH-2 is not known. However, similar motifs in other
proteins have been analysed. Structural studies have shown that a 'type A'
sequence is a flexible loop bounded by a β sheet with an α helix on either
side (22-26). This flexible loop allows the protein to undergo conformational
change, thus controlling the accessibility of substrate binding or binding site
affinities (27). Further studies have shown that an analogous lysine plays an
important role in the ATP-dependent function of these proteins (27-28).

The key role of Lys 675 is also emphasised by the fact that mutations at 
residues 666, 668 and indeed 674 (the residue next to the critical Lys residue) 
have minimal effects on mismatch recognition and ATPase activity. 
Furthermore, mutation of the conserved Ser 676, the residue immediately 
C-terminal of Lys 675, still provides a protein which retains 40% of its normal 
activity. These observations suggest that structural factors alone may not 
fully explain the importance of Lys 675 and that perhaps this basic cationic 
residue is involved more directly, for example in recognising the phosphate 
backbone at the mismatch point in mispaired DNA. 
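
By way of illustration only, the substitution data discussed above may be tabulated as in the following Python sketch. The names used (Substitution, critical_residues, covered_by) are hypothetical and form no part of the described method. The effect labels are shorthand for the qualitative observations reported above, and the fragment boundaries are those recited in the claims, taken as 1-based inclusive residue numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Substitution:
    residue: int                               # 1-based position in hMSH-2
    effect: str                                # shorthand for the reported effect
    retained_activity: Optional[float] = None  # fraction of normal, where stated


# Observations as reported in the Discussion; None means no figure is quoted.
SUBSTITUTIONS = [
    Substitution(666, "minimal"),
    Substitution(668, "minimal"),
    Substitution(674, "minimal"),
    Substitution(675, "abolished"),        # Lys -> Ala (A4)
    Substitution(676, "reduced", 0.40),    # Ser -> Ala (A5)
]

# Fragment boundaries recited in the claims (1-based, inclusive).
FRAGMENTS = {
    "637-877": (637, 877),
    "644-877": (644, 877),
    "664-805": (664, 805),
}


def fragment_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def critical_residues(subs):
    """Residues whose substitution abolished selective mismatch binding."""
    return [s.residue for s in subs if s.effect == "abolished"]


def covered_by(fragment, subs) -> bool:
    """True if every substituted residue lies inside the fragment."""
    start, end = fragment
    return all(start <= s.residue <= end for s in subs)


if __name__ == "__main__":
    for name, (start, end) in FRAGMENTS.items():
        print(name, fragment_length(start, end), covered_by((start, end), SUBSTITUTIONS))
    print("critical:", critical_residues(SUBSTITUTIONS))
```

On these figures every substituted residue, including Lys 675, falls within each of the three recited fragments, the shortest of which spans 142 residues.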
CLAIMS 

1. An isolated polypeptide showing mismatch nucleotide binding activity 
in vitro, said polypeptide comprising at least a part of the C terminal domain 
of a nucleotide binding protein, or a type A nucleotide binding motif, which 
domain, or motif, further exhibits ATPase activity. 

2. A polypeptide according to claim 1 wherein said enzyme is hMSH-2 
or a homologue or analogue thereof. 

3. A polypeptide according to claims 1 or 2 wherein said fragment 
comprises no more than the last 300 amino acids of said enzyme. 

4. A polypeptide according to claim 3 wherein said fragment comprises 
no more than the last 270 amino acids of said enzyme. 

5. A polypeptide according to any preceding claim which comprises 
amino acids 637-877 of the protein hMSH-2. 

6. A polypeptide according to claim 2 wherein said fragment comprises 
amino acids 644-877 of the protein hMSH-2. 

7. A polypeptide according to claim 2 wherein said fragment comprises 
amino acids 664-805 of the protein hMSH-2. 



8. A polypeptide according to any preceding claim wherein said 
polypeptide is provided with a tag so as to enable the binding of same to be 
detected. 
9. A polypeptide according to claim 8 wherein said tag is a ^-terminal 
flag peptide. 

10. A polypeptide according to claim 9 wherein said tag further includes 
an antibody adapted to bind to said £-terminal flag peptide. 

11. An expression system for the manufacture of a polypeptide according 
to any of the preceding claims which system comprises a host cell including 
a fragment of DNA encoding the polypeptide according to any of the 
preceding claims which DNA is functionally coupled to the replication 
system of the host cell whereby said polypeptide can be made. 

12. A vector for transforming a host cell whereby the polypeptide 
according to any one of claims 1-10 of the invention can be made. 

13. A method for obtaining a polypeptide according to claims 1 to 10 
comprising: 

a) inserting a fragment of DNA encoding the said polypeptide and 
any necessary transcriptional/translational control elements into 
a suitable host cell expression system; 



b) providing conditions which favour transcription and translation 
of said DNA in said host cell; 

c) harvesting said host cells; 

d) lysing said host cells; 




e) collecting the lysates; and 

f) purifying the polypeptide fragment. 

14. A DNA sequence encoding the polypeptide according to any one of 
claims 1-10. 

15. Oligonucleotides for amplifying the DNA encoding the polypeptide 
according to any one of claims 1-10. 

16. A method for identifying mismatched oligonucleotides comprising 
exposing strands of oligonucleotides to the polypeptide of any one of claims 
1-10 under conditions which promote binding and determining the amount of 
binding taking place. 

17. A kit for determining mismatch binding comprising at least a 
polypeptide according to any one of claims 1-10. 

18. A kit according to claim 17 which further comprises a control 
including at least one mismatch binding pair of oligonucleotides. 

19. A kit according to claims 17 or 18 comprising at least one matched 
complementary binding pair of oligonucleotides. 

20. The use of a polypeptide according to any one of claims 1-10 for 
detection of mismatched complementary oligonucleotide pairs or of 
mismatches in double-stranded nucleic acid fragments or in double-stranded 
PCR products. 

21. Means for regulating the activity of a nucleotide binding protein, or a 
fragment thereof, comprising the substitution, or deletion, of a critical codon, 
or amino acid, in the nucleotide binding domain thereof. 

22. Means according to claim 21 wherein said regulation comprises a 
manipulation of: the codon encoding the amino acid lysine at codon 675 of 
hMSH-2, or its equivalent in a homologous or analogous protein, or the 
corresponding lysine amino acid, so that lysine is either substituted or deleted 
in the relevant protein, or a fragment thereof. 